Featured Article: Safety Considerations Around ChatGPT Image Uploads

Written by: Paul

One of ChatGPT’s latest features is the ability to upload images to help get answers to queries. Here we look at why there have been security concerns about releasing the feature. 

Update To ChatGPT 

The new ‘Image input’ feature, which will soon be generally available to Plus users on all platforms, has just been announced alongside a voice capability, enabling users to have a voice conversation with ChatGPT, and the ‘Browse’ feature, which enables the chatbot to browse the internet for current information. 

ChatGPT and Other Chatbot Limitations and Concerns 

Prior to the latest concerns about the new ‘Image input’ feature, several concerns and limitations regarding ChatGPT had already been highlighted. 

For example, OpenAI’s CEO Sam Altman has long been clear about the possibility that the chatbot can make things up in a kind of “hallucination” when replying to questions. There is also a clear warning at the foot of the ChatGPT user account page confirming this, which says: “ChatGPT may produce inaccurate information about people, places, or facts.”

Also, back in March, the UK’s National Cyber Security Centre (NCSC) published warnings that LLMs (the language models powering AI chatbots) can: 

– Get things wrong and ‘hallucinate’ incorrect facts. 

– Display bias and be “gullible” (in responding to leading questions, for example). 

– Be “coaxed into creating toxic content and are prone to injection attacks.” 

For these and other reasons, the NCSC recommends not including sensitive information in queries to public LLMs, and not submitting queries to public LLMs that would lead to issues if they were made public. 

It’s within this context of the recognised and documented imperfections of chatbots that we look at the risks that a new image dimension could present. 

Image Input  

The new ‘Image input’ feature for ChatGPT, a capability already introduced by Google’s Bard, is intended to let users draw on the contents of images to better explain their questions, help troubleshoot, get an explanation of a complex graph, or generate other helpful responses based on the picture. In effect, it’s intended for situations (just as in real life) where it may be quicker and more effective to show a picture of something than to try and explain it. ChatGPT’s powerful image recognition abilities mean that it can describe what’s in uploaded images, answer questions about them and even recognise specific people’s faces.

ChatGPT’s ‘Image input’ feature owes much to a collaboration (in March) between OpenAI and the ‘Be My Eyes’ platform, which led to the creation of ‘Be My AI’, a new tool to describe the visual world for people who are blind or have low vision. In essence, the Be My Eyes platform seems to have provided an ideal testing ground to inform how GPT-4V could be deployed responsibly. 

How To Use It 

The new Image input feature allows users to tap the photo button to capture or choose an image, to show/upload one or more images to ChatGPT, and even to use a drawing tool in the mobile app to focus on a specific part of an image. 
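For developers, the same capability is exposed through OpenAI’s API, where an image can be sent alongside a text question in a single chat message. The sketch below (in Python) builds such a request payload by base64-encoding a local image into a data URL; the model name and exact message shape are assumptions based on the API as documented at launch, and no API call is actually made here.

```python
import base64


def build_image_query(image_path: str, question: str,
                      model: str = "gpt-4-vision-preview") -> dict:
    """Build a Chat Completions request payload pairing a local image
    with a text question.

    The image is base64-encoded into a data URL; the API also accepts
    a plain public image URL in the same "image_url" slot.
    """
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return {
        "model": model,  # assumed vision-capable model name
        "messages": [
            {
                "role": "user",
                # Content is a list mixing text and image parts.
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                ],
            }
        ],
    }
```

The returned dictionary would then be posted to the chat completions endpoint with an API key; keeping payload construction separate, as here, also makes it easy to inspect exactly what image data is being sent before it leaves your machine.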

Concerns About Image Input 

Although it’s easy to see how Image input could be helpful, it’s been reported that OpenAI was reluctant to release GPT-4V / GPT-4 with ‘vision’ because of privacy issues over its facial recognition abilities, and over what it may ‘say’ about people’s faces.

Testing 

OpenAI says that before releasing Image input, its “red teamers” tested it in relation to how it performed in areas of concern. These testing areas give a good idea of the ways in which Image input, a totally new vector for ChatGPT, could provide the wrong response or be manipulated. 

For example, OpenAI says its teams tested the new feature in areas including scientific proficiency, medical advice, stereotyping and ungrounded inferences, disinformation risks, hateful content, and visual vulnerabilities. It also looked at its performance in areas such as: 

– Sensitive trait attribution across demographics (e.g. recognising gender, age, and race from images of people). 

– Person identification. 

– Ungrounded inference evaluation (inferences that are not justified by the information the user has provided). 

– Jailbreak evaluations (prompts that circumvent the safety systems in place to prevent malicious misuse). 

– Advice or encouragement for self-harm behaviours. 

– Graphic material, CAPTCHA breaking, and geolocation. 

Concerns 

Following its testing, some of the concerns highlighted about the ‘vision’ aspect of ChatGPT, as detailed in OpenAI’s own September 25 technical paper, include: 

– Where “Hateful content” in images is concerned, GPT-4V was found to refuse to answer questions about hate symbols and extremist content in some instances but not all. For example, it can’t always recognise lesser-known hate group symbols. 

– It shouldn’t be relied upon for accurate identifications in areas such as medical or scientific analysis.

– In relation to stereotyping and ungrounded inferences, using GPT-4V for some tasks could generate unwanted or harmful assumptions that are not grounded in the information provided to the model. 

Other Security, Privacy, And Legal Concerns 

OpenAI’s own assessments aside, major concerns raised by tech and security commentators about ChatGPT’s facial recognition capabilities in relation to the Image input feature are that: 

– It could be used as a facial recognition tool by malicious actors. For example, it could be used in some way in conjunction with WormGPT (the AI chatbot trained on malware and designed to extort victims), or more generally in identity fraud scams. 

– It could say things about faces that provide unsafe assessments, e.g. about their gender or emotional state.

– Its LLM risks producing incorrect results in potentially risky areas, such as identifying illegal drugs or safe-to-eat mushrooms and plants.

– The GPT-4V model may (as with the text version) give responses (both text and images) that could be used by some bad-actors to spread disinformation at scale. 

– In Europe, it could cause legal issues under GDPR, since citizens’ consent is required to process their biometric data. 

What Does This Mean For Your Business? 

This could be a legal minefield for OpenAI and may even pose risks to users, as OpenAI’s many testing categories show. It is unsurprising that OpenAI held back the release of GPT-4V (GPT-4 with vision) over safety and privacy issues, e.g. its facial recognition capabilities.

Certainly, adding new modalities like image inputs to LLMs expands the impact of language-only systems with new interfaces and capabilities, enabling new tasks to be solved and providing novel experiences for users. Yet it’s hard to ignore the risks of facial recognition being abused. OpenAI has, of course, ‘red teamed’, tested, and introduced refusals and blocks where it can, but as is publicly known and admitted by OpenAI and others, chatbots are imperfect, still in the early stages of development, and certainly capable of producing wrong (and potentially damaging) responses, while there are legal matters like consent (facial images are personal data) to consider.

The fact that a malicious version of ChatGPT has already been produced and circulated by criminals has highlighted concerns about the threats posed by the technology, and about how an image aspect could elevate this threat in some way. Biometric data is now being used as verification for devices, services, and accounts, and with convincing deepfake technology already in use, we don’t yet know what inventive ways cyber criminals could find to use image inputs in chatbots as part of a new landscape of scams. 

It’s a fast-moving, competitive market, however, as the big tech companies race to make their own chatbots as popular as possible, and despite its initial reluctance, OpenAI may have felt some pressure to get its image input feature out there now in order to stay competitive. The functionalities recently introduced to ChatGPT (such as image input) illustrate that to make chatbots more useful and competitive, some lines must be crossed, however tentatively, even though this could increase risks to users and to companies like OpenAI.